    A Probabilistic Model for Malicious User and Rumor Detection on Social Media

    Rumor detection in recent years has emerged as an important research topic, as fake news on social media now has more significant impacts on people's lives, especially during complex and controversial events. Most existing rumor detection techniques, however, only provide shallow analyses of users who propagate rumors. In this paper, we propose a probabilistic model that describes user maliciousness with a two-sided perception of rumors and true stories. We model not only the behavior of retweeting rumors, but also the intention. We propose learning algorithms for discovering latent attributes and detecting rumors based on such attributes, supposedly more effectively when the stories involve retweets with mixed intentions. Using real-world rumor datasets, we show that our approach can outperform existing methods in detecting rumors, especially for more confusing stories. We also show that our approach can capture malicious users more effectively.

    Updated Data Dissemination for Applications with Time Constraints in Mobile Ad Hoc Networks

    In our previous work, we proposed a few updated data dissemination methods to refresh old replicas efficiently in mobile ad hoc networks. These methods disseminate updated data items every time owners of original data items update the items or every time two mobile hosts are newly connected with each other, which causes heavy traffic in the entire network. In this paper, we assume applications that periodically execute read operations with strict deadlines on data items and propose a few alternative updated data dissemination methods. These methods reduce the traffic for data dissemination while keeping a high success ratio for read operations.

    Consistency Management Among Replicas in Peer-to-Peer Mobile Ad Hoc Networks

    Recent advances in wireless communication along with the peer-to-peer (P2P) paradigm have led to increasing interest in P2P mobile ad hoc networks. In this paper, we assume an environment where each mobile peer accesses data items held by other peers which are connected by a mobile ad hoc network. Since peers' mobility causes frequent network partitions, replicas of a data item may be inconsistent due to write operations performed by mobile peers. In such an environment, the global consistency of data items is not desirable for many applications. Thus, new consistency maintenance based on local conditions such as location and time needs to be investigated. This paper attempts to classify different consistency levels according to requirements from applications and provides protocols to realize them. We report simulation results to investigate the characteristics of these consistency protocols in a P2P wireless ad hoc network environment and their relationship with the quorum sizes.

    Data Replication for Improving Data Accessibility in Ad Hoc Networks

    In ad hoc networks, due to frequent network partition, data accessibility is lower than that in conventional fixed networks. In this paper, we solve this problem by replicating data items on mobile hosts. First, we propose three replica allocation methods assuming that each data item is not updated. In these three methods, we take into account the access frequency from mobile hosts to each data item and the status of the network connection. Then, we extend the proposed methods by considering aperiodic updates and integrating user profiles consisting of mobile users' schedules, access behavior, and read/write patterns. We also show the results of simulation experiments regarding the performance evaluation of our proposed method.

    Consistency Management Strategies for Data Replication in Mobile Ad Hoc Networks

    In a mobile ad hoc network, data replication drastically improves data availability. However, since mobile hosts' mobility causes frequent network partitioning, consistency management of data operations on replicas becomes a crucial issue. In such an environment, the global consistency of data operations on replicas is not desirable for many applications. Thus, new consistency maintenance based on local conditions such as location and time needs to be investigated. This paper attempts to classify different consistency levels according to requirements from applications and provides protocols to realize them. We report simulation results to investigate the characteristics of these consistency protocols in a mobile ad hoc network.

    Identifying the most interactive object in spatial databases

    This paper investigates a new query, called an MIO query, that retrieves the Most Interactive Object in a given spatial dataset. Consider that an object consists of many spatial points. Given a distance threshold, we say that two objects interact with each other if they have a pair of points whose distance is within the threshold. An MIO query outputs the object that interacts with other objects the most, and it is useful for analytical applications, e.g., neuroscience and trajectory databases. The MIO query processing problem is challenging: a nested loop algorithm is computationally inefficient, and a theoretical algorithm is computationally efficient but incurs a quadratic space cost. Our solution efficiently processes MIO queries with a novel index, BIGrid (a hybrid index of compressed Bitset, Inverted list, and spatial Grid structures), with a practical memory cost. Furthermore, our solution is designed so that previous query results and multi-core environments can be exploited to accelerate query processing. Our experiments on synthetic and real datasets demonstrate the efficiency of our solution.

    Amagata D., Hara T. Identifying the most interactive object in spatial databases. Proceedings - International Conference on Data Engineering 2019-April, 1286 (2019); https://doi.org/10.1109/ICDE.2019.00117
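    The interaction definition in the abstract above can be illustrated with the nested-loop baseline it mentions. This is only a minimal sketch of that inefficient baseline, not the BIGrid index; the function names `interacts` and `mio_query` and the representation of an object as a list of coordinate tuples are assumptions made for illustration.

```python
from math import dist


def interacts(a, b, r):
    """True if some point of object a lies within distance r of some point of object b."""
    return any(dist(p, q) <= r for p in a for q in b)


def mio_query(objects, r):
    """Nested-loop baseline (hypothetical helper): return (index, count) of the
    object that interacts with the largest number of other objects."""
    counts = [0] * len(objects)
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if interacts(objects[i], objects[j], r):
                counts[i] += 1
                counts[j] += 1
    best = max(range(len(objects)), key=counts.__getitem__)
    return best, counts[best]


# Toy data: four objects, each a list of 2-D points.
objects = [[(0, 0)], [(1, 0), (5, 5)], [(5, 6)], [(100, 100)]]
result = mio_query(objects, 1.5)
```

    Every pair of objects is compared point by point, which is the cost the abstract describes as computationally inefficient.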

    Fast Density-Peaks Clustering: Multicore-based Parallelization Approach

    Clustering multi-dimensional points is a fundamental task in many fields, and density-based clustering supports many applications as it can discover clusters of arbitrary shapes. This paper addresses the problem of Density-Peaks Clustering (DPC), a recently proposed density-based clustering framework. Although DPC already has many applications, its straightforward implementation takes time quadratic in the number of points in a given dataset, and thus does not scale to large datasets. To enable DPC on large datasets, we propose efficient algorithms for DPC. Specifically, we propose an exact algorithm, Ex-DPC, and two approximation algorithms, Approx-DPC and S-Approx-DPC. Under a reasonable assumption about a DPC parameter, our algorithms are sub-quadratic, i.e., break the quadratic barrier. Besides, Approx-DPC does not require any additional parameters and can return the same cluster centers as those of Ex-DPC, rendering an accurate clustering result. S-Approx-DPC requires an approximation parameter but can further improve computational efficiency. We further show that their efficiency can be improved by leveraging multicore processing. We conduct extensive experiments using synthetic and real datasets, and our experimental results demonstrate that our algorithms are efficient, scalable, and accurate.
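    For orientation, the quadratic cost of the straightforward DPC implementation can be seen in a small sketch. The abstract does not spell out the DPC quantities, so this assumes the usual density-peaks formulation: a local density (number of points within a cutoff distance) and a dependent distance (distance to the nearest point of higher density). The name `dpc_quantities` and the parameter `d_cut` are illustrative, and this is not Ex-DPC, Approx-DPC, or S-Approx-DPC.

```python
from math import dist


def dpc_quantities(points, d_cut):
    """Straightforward O(n^2) computation of the two DPC quantities (sketch).

    rho[i]   : number of other points within distance d_cut of point i
    delta[i] : distance to the nearest point with strictly higher rho;
               a point with no denser point gets its maximum distance
               to any point (a simplifying assumption for ties).
    """
    n = len(points)
    rho = [
        sum(1 for j in range(n) if j != i and dist(points[i], points[j]) < d_cut)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist(points[i], points[j]) for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            delta.append(max(dist(points[i], points[j]) for j in range(n)))
    return rho, delta


points = [(0, 0), (0, 1), (1, 0), (10, 10)]
rho, delta = dpc_quantities(points, 1.2)
```

    Points with both a large density and a large dependent distance are the candidates for cluster centers; both loops touch every pair of points, which is the quadratic behavior the abstract refers to.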

    Reverse maximum inner product search: How to efficiently find users who would like to buy my item?

    The MIPS (maximum inner product search), which finds the item with the highest inner product with a given query user, is an essential problem in the recommendation field. E-commerce companies often face situations where they want to promote and sell new or discounted items. In these situations, we have to consider a question: who is interested in the items, and how do we find them? This paper answers this question by addressing a new problem called reverse maximum inner product search (reverse MIPS). Given a query vector and two sets of vectors (user vectors and item vectors), the problem of reverse MIPS finds a set of user vectors whose inner product with the query vector is the maximum among the query and item vectors. Although the importance of this problem is clear, its straightforward implementation is computationally expensive. We therefore propose Simpfer, a simple, fast, and exact algorithm for reverse MIPS. In an offline phase, Simpfer builds a simple index that maintains a lower bound of the maximum inner product. By exploiting this index, Simpfer judges whether the query vector can have the maximum inner product or not, for a given user vector, in constant time. Besides, our index enables filtering user vectors, which cannot have the maximum inner product with the query vector, in a batch. We theoretically demonstrate that Simpfer outperforms baselines employing state-of-the-art MIPS techniques. Furthermore, our extensive experiments on real datasets show that Simpfer is about 500-8000 times faster than the baselines.
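    The reverse MIPS problem statement above can be made concrete with a brute-force sketch. This is the straightforward, expensive approach and not Simpfer; the names `dot` and `reverse_mips` are hypothetical, and treating a tie between the query and an item as a match is an assumption, since the abstract does not say how ties are handled.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def reverse_mips(query, users, items):
    """Brute-force reverse MIPS (sketch): indices of users for whom the query
    vector has the maximum inner product among the query and all item vectors.
    Ties count as a match (assumption)."""
    result = []
    for idx, u in enumerate(users):
        best_item = max(dot(u, p) for p in items)
        if dot(u, query) >= best_item:
            result.append(idx)
    return result


users = [(1, 0), (0, 1), (1, 1)]
items = [(3, 0), (0, 3)]
query = (2, 2)
matched = reverse_mips(query, users, items)
```

    Each user requires a full scan of the item set, so the cost grows with the product of the two set sizes; this is the work that an index with per-user lower bounds, as described in the abstract, aims to avoid.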

    Correlation Set Discovery on Time-Series Data

    Time-series data analysis is essential in many modern applications, such as financial markets, sensor networks, and data centers, and correlation discovery is a core technique for the analysis. In this paper, we address a novel problem that computes a k-sized time-series dataset where the minimum Pearson correlation of any two time-series in the set is maximized. This problem discovers a group of time-series, which are highly correlated with each other, from a given time-series dataset without any prior knowledge, and thus helps many analytical applications. We show that this problem is NP-hard, and design an approximate heuristic solution that provides a high-quality result with fast response time. Extensive experiments on real and synthetic datasets verify the efficiency, effectiveness, and scalability of our solution.

    This version of the contribution has been accepted for publication, after peer review (when applicable) but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/978-3-030-27618-8_21
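    The objective in the correlation set abstract above (a k-sized set whose minimum pairwise Pearson correlation is maximized) can be sketched as an exhaustive search. This is only an illustration of the problem definition for tiny inputs, not the paper's heuristic; the names `pearson` and `best_correlation_set` are hypothetical, and the sketch assumes k >= 2 and non-constant series of equal length.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_correlation_set(series, k):
    """Exhaustive search (sketch): the k-subset of series indices whose
    minimum pairwise Pearson correlation is largest."""
    best_set, best_score = None, -2.0  # correlations are never below -1
    for subset in combinations(range(len(series)), k):
        score = min(
            pearson(series[i], series[j]) for i, j in combinations(subset, 2)
        )
        if score > best_score:
            best_set, best_score = subset, score
    return best_set, best_score


series = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]
chosen, score = best_correlation_set(series, 2)
```

    The number of k-subsets grows combinatorially, which is consistent with the abstract's NP-hardness result and its choice of an approximate heuristic.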